Prosodic phrasing with inductive learning
Abstract
Prosodic phrasing is an important component of modern TTS systems: it inserts natural and reasonable breaks into long utterances. This paper reports a study that applies several inductive machine-learning algorithms to prosodic phrasing in unrestricted Chinese text. Two feature sets are carefully selected for their effectiveness and reliability in practice. Features and target boundary labels are then extracted from a prepared speech corpus and used as training examples for inductive learning algorithms: decision trees (C4.5), memory-based learning (MBL) and support vector machines (SVMs). The paper emphasizes the comparison of the performance and speed of the different learning techniques, which are trained and tested on the same corpus. The experiments show that all the algorithms achieve comparable results for both prosodic word and prosodic phrase prediction. Prosodic words appear to be predicted from Chinese text more accurately than prosodic phrases when the same features and learning technique are used. Inductive learning is a promising approach to prosodic phrasing, but to improve prediction accuracy dramatically, finding good features matters more than switching between learning algorithms.
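The setup described in the abstract (features extracted at each word juncture, a boundary label as the target, and an inductive learner trained on them) can be sketched with a small memory-based learner. This is only an illustration under stated assumptions: the four features (part-of-speech tags and syllable counts of the two adjacent words), the toy examples, and the function names are hypothetical and are not the paper's feature sets, corpus, or implementation.

```python
from collections import Counter

# Hypothetical toy training examples, NOT taken from the paper's corpus.
# Each example describes the juncture between two adjacent words:
#   (POS of left word, POS of right word,
#    syllables in left word, syllables in right word)
# Label: "B" = prosodic boundary at the juncture, "NB" = no boundary.
TRAIN = [
    (("n", "v", 2, 2), "B"),
    (("n", "v", 2, 1), "B"),
    (("adj", "n", 1, 2), "NB"),
    (("adj", "n", 2, 2), "NB"),
    (("v", "n", 1, 2), "NB"),
    (("v", "p", 2, 1), "B"),
    (("p", "n", 1, 2), "NB"),
    (("n", "p", 2, 1), "B"),
]


def overlap_distance(a, b):
    """Count the feature positions at which two examples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def predict(features, train=TRAIN, k=3):
    """Label a juncture by majority vote among its k nearest stored examples."""
    ranked = sorted(train, key=lambda ex: overlap_distance(features, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(test, train=TRAIN, k=3):
    """Fraction of held-out junctures whose predicted label matches the target."""
    correct = sum(1 for feats, label in test if predict(feats, train, k) == label)
    return correct / len(test)
```

Calling `predict(("n", "v", 2, 2))` classifies one juncture, and `accuracy` scores a list of held-out `(features, label)` pairs. Evaluating several learners on the same held-out list is the kind of like-for-like comparison the abstract describes; a decision tree or an SVM would replace `predict` while the feature extraction stays the same.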
Similar resources
A New Prosodic Phrasing Model for Chinese TTS Systems
This paper proposes a new prosodic phrasing model for Chinese text-to-speech systems. First, in contrast to the commonly used CART techniques, we propose a new inductive learning algorithm based on the extension matrix theory. Second, we collected 559 sentences (of approximately 78 min length) from news programs and built a corresponding speech corpus uttered by a professional male announcer. Th...
A new prosodic phrasing model for Indian language Telugu
Prosodic phrasing is an important and more difficult problem for Indian languages, as Indian language scripts use very little or no punctuation. This paper reports a preliminary attempt at data-driven modeling of prosodic phrase boundary prediction for the Indian language Telugu. In an effort to identify meaningful features that affect the prosodic phrasing, a new feature, namely morpheme ...
A Grammar Based Approach to Style Specific Phrase Prediction
We present an approach to style specific phrasing for Text-to-Speech (TTS) systems. We formulate the problem of phrase break prediction (or phrasing) as generation of a sequence of breaks (B) and non-breaks (NB) after each word in a sentence. We use prosodic breaks in speech data to build shallow parses over corresponding text. We then learn a grammar that can predict these shallow prosodic pars...
Training prosodic phrasing rules for Chinese TTS systems
This paper describes several experiments designed to train prosodic phrasing models for Chinese TTS systems and to investigate the underlying rules that control Chinese prosody. First, we collected 559 sentences from news programs and built a large corpus for modeling Chinese prosody. Second, we selected 20 features and used classification and regression trees (CART) and transformational rule-b...
Learning PP attachment for filtering prosodic phrasing
We explore learning prepositional-phrase attachment in Dutch, to use it as a filter in prosodic phrasing. From a syntactic treebank of spoken Dutch we extract instances of the attachment of prepositional phrases to either a governing verb or noun. Using cross-validated parameter and feature selection, we train two learning algorithms, IB1 and RIPPER, on making this distinction, based on unigra...
Publication date: 2002